{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "d50c9c47",
   "metadata": {},
   "source": [
    "# Data Preparation for Anthropic Claude-3 Haiku Fine-Tuning\n",
    "\n",
    "This notebook will guide you through the process of creating the necessary resources and preparing the datasets for fine-tuning the Anthropic Claude-3 Haiku model using Amazon Bedrock. By the end of this notebook, you will have created an IAM role, an S3 bucket, and training, validation, and testing datasets in the required format for the fine-tuning process."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f32bfc37",
   "metadata": {},
   "source": [
    "### Pre-requisites\n",
    "\n",
    "#### Custom job role\n",
    "\n",
    "The notebook allows you to either create a Bedrock role for running customization jobs in the **Create IAM customisation job role** section or you can skip this section and create Bedrock Service role for customization jobs following [instructions on managing permissions for customization jobs](https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-iam-role.html). If you want to using an existing custom job role please edit the variable **customization_role** and also ensure it has access to the S3 bucket which is created containing the dataset. "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fcbafb9d",
   "metadata": {},
   "source": [
    "#### Create IAM Pre-requisites\n",
    "\n",
    "This notebook requires permissions to:\n",
    "\n",
    "- create and delete Amazon IAM roles\n",
    "- create, update and delete Amazon S3 buckets\n",
    "- access Amazon Bedrock\n",
    "\n",
    "If you are running this notebook without an Admin role, make sure that your role include the following managed policies:\n",
    "\n",
    "- IAMFullAccess\n",
    "\n",
    "- AmazonS3FullAccess\n",
    "\n",
    "- AmazonBedrockFullAccess\n",
    "\n",
    "You can also create a custom model in the Bedrock console following the instructions [here](https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-submit.html)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a594ab9e",
   "metadata": {},
   "source": [
    "### Setup\n",
    "\n",
    "Install and import all the needed libraries and dependencies to complete this notebook.\n",
    "\n",
    "<div class=\"alert alert-block alert-warning\">\n",
    "<b>Warning:</b> Please ignore error messages related to pip's dependency resolver.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "93e28916",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "!pip install --upgrade pip\n",
    "%pip install --no-build-isolation --force-reinstall \\\n",
    "    \"boto3>=1.28.57\" \\\n",
    "    \"awscli>=1.29.57\" \\\n",
    "    \"botocore>=1.31.57\"\n",
    "!pip install -qU --force-reinstall langchain typing_extensions pypdf urllib3==2.1.0\n",
    "!pip install -qU ipywidgets>=7,<8\n",
    "!pip install jsonlines\n",
    "!pip install datasets==2.15.0\n",
    "!pip install pandas==2.1.3\n",
    "!pip install matplotlib==3.8.2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e6958613",
   "metadata": {},
   "outputs": [],
   "source": [
    "# restart kernel for packages to take effect\n",
    "from IPython.core.display import HTML\n",
    "HTML(\"<script>Jupyter.notebook.kernel.restart()</script>\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "745c4b29",
   "metadata": {},
   "outputs": [],
   "source": [
    "import warnings\n",
    "warnings.filterwarnings('ignore')\n",
    "import json\n",
    "import os\n",
    "import sys\n",
    "import boto3 \n",
    "import time\n",
    "import pprint\n",
    "from datasets import load_dataset\n",
    "import random\n",
    "import jsonlines"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e2888c79",
   "metadata": {},
   "outputs": [],
   "source": [
    "session = boto3.session.Session()\n",
    "region = session.region_name\n",
    "sts_client = boto3.client('sts')\n",
    "account_id = sts_client.get_caller_identity()[\"Account\"]\n",
    "s3_suffix = f\"{region}-{account_id}\"\n",
    "bucket_name = f\"bedrock-haiku-customization-{s3_suffix}\"\n",
    "s3_client = boto3.client('s3')\n",
    "bedrock = boto3.client(service_name=\"bedrock\")\n",
    "bedrock_runtime = boto3.client(service_name=\"bedrock-runtime\")\n",
    "iam = boto3.client('iam', region_name=region)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c06e8486",
   "metadata": {},
   "outputs": [],
   "source": [
    "import uuid\n",
    "suffix = str(uuid.uuid4())\n",
    "role_name = \"BedrockRole-\" + suffix\n",
    "s3_bedrock_finetuning_access_policy=\"BedrockPolicy-\" + suffix\n",
    "customization_role = f\"arn:aws:iam::{account_id}:role/{role_name}\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b92c7dee",
   "metadata": {},
   "source": [
    "### Testing boto3 connection\n",
    "\n",
    "We will list the foundation models to test the boto3 connection and make sure bedrock client has been successfully created. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0b64fe61",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "for model in bedrock.list_foundation_models(\n",
    "    byCustomizationType=\"FINE_TUNING\")[\"modelSummaries\"]:\n",
    "    for key, value in model.items():\n",
    "        print(key, \":\", value)\n",
    "    print(\"-----\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "767e35bc",
   "metadata": {},
   "source": [
    "### Create S3 Bucket\n",
    "\n",
    "In this step we will create a S3 bucket, which will be used to store data for Claude-3 Haiku fine-tuning notebook. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "199c4502",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create S3 bucket for knowledge base data source\n",
    "s3bucket = s3_client.create_bucket(\n",
    "    Bucket=bucket_name,\n",
    "    ## Uncomment the following if you run into errors\n",
    "    # CreateBucketConfiguration={\n",
    "    #     'LocationConstraint':region,\n",
    "    # },\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "82a3ee8e",
   "metadata": {},
   "source": [
    "### Creating Role and Policies Required to Run Customization Jobs with Amazon Bedrock\n",
    "\n",
    "This JSON object defines the trust relationship that allows the bedrock service to assume a role that will give it the ability to talk to other required AWS services. The conditions set restrict the assumption of the role to a specfic account ID and a specific component of the bedrock service (model_customization_jobs)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f44a290a",
   "metadata": {},
   "outputs": [],
   "source": [
    "ROLE_DOC = f\"\"\"{{\n",
    "    \"Version\": \"2012-10-17\",\n",
    "    \"Statement\": [\n",
    "        {{\n",
    "            \"Effect\": \"Allow\",\n",
    "            \"Principal\": {{\n",
    "                \"Service\": \"bedrock.amazonaws.com\"\n",
    "            }},\n",
    "            \"Action\": \"sts:AssumeRole\",\n",
    "            \"Condition\": {{\n",
    "                \"StringEquals\": {{\n",
    "                    \"aws:SourceAccount\": \"{account_id}\"\n",
    "                }},\n",
    "                \"ArnEquals\": {{\n",
    "                    \"aws:SourceArn\": \"arn:aws:bedrock:{region}:{account_id}:model-customization-job/*\"\n",
    "                }}\n",
    "            }}\n",
    "        }}\n",
    "    ]\n",
    "}}\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a6f6ccc2",
   "metadata": {},
   "source": [
    "This JSON object defines the permissions of the role we want bedrock to assume to allow access to the S3 bucket that we created that will hold our fine-tuning datasets and allow certain bucket and object manipulations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1259a923",
   "metadata": {},
   "outputs": [],
   "source": [
    "ACCESS_POLICY_DOC = f\"\"\"{{\n",
    "    \"Version\": \"2012-10-17\",\n",
    "    \"Statement\": [\n",
    "        {{\n",
    "            \"Effect\": \"Allow\",\n",
    "            \"Action\": [\n",
    "                \"s3:AbortMultipartUpload\",\n",
    "                \"s3:DeleteObject\",\n",
    "                \"s3:PutObject\",\n",
    "                \"s3:GetObject\",\n",
    "                \"s3:GetBucketAcl\",\n",
    "                \"s3:GetBucketNotification\",\n",
    "                \"s3:ListBucket\",\n",
    "                \"s3:PutBucketNotification\"\n",
    "            ],\n",
    "            \"Resource\": [\n",
    "                \"arn:aws:s3:::{bucket_name}\",\n",
    "                \"arn:aws:s3:::{bucket_name}/*\"\n",
    "            ]\n",
    "        }}\n",
    "    ]\n",
    "}}\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a57d8c3e",
   "metadata": {},
   "outputs": [],
   "source": [
    "response = iam.create_role(\n",
    "    RoleName=role_name,\n",
    "    AssumeRolePolicyDocument=ROLE_DOC,\n",
    "    Description=\"Role for Bedrock to access S3 for haiku finetuning\",\n",
    ")\n",
    "pprint.pp(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "39f9b621",
   "metadata": {},
   "outputs": [],
   "source": [
    "role_arn = response[\"Role\"][\"Arn\"]\n",
    "pprint.pp(role_arn)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0f1bfe2a",
   "metadata": {},
   "outputs": [],
   "source": [
    "response = iam.create_policy(\n",
    "    PolicyName=s3_bedrock_finetuning_access_policy,\n",
    "    PolicyDocument=ACCESS_POLICY_DOC,\n",
    ")\n",
    "pprint.pp(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "588bf683",
   "metadata": {},
   "outputs": [],
   "source": [
    "policy_arn = response[\"Policy\"][\"Arn\"]\n",
    "pprint.pp(policy_arn)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f36dca43",
   "metadata": {},
   "outputs": [],
   "source": [
    "iam.attach_role_policy(\n",
    "    RoleName=role_name,\n",
    "    PolicyArn=policy_arn,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "580b0cf0",
   "metadata": {},
   "source": [
    "### Prepare Dataset for Claude-3 Haiku fine-tuning and Evaluation\n",
    "\n",
    "The dataset that will be used is a collection of messenger-like conversations with summaries. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "698f48aa",
   "metadata": {},
   "outputs": [],
   "source": [
    "#Load samsum dataset from huggingface\n",
    "dataset = load_dataset(\"knkarthick/samsum\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e15c2ad8",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(dataset)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a8c8c1c9",
   "metadata": {},
   "source": [
    "To fine-tune the Claude-3 Haiku model, the training data must be in `JSONL (JSON Lines)` format, where each line represents a single training record. Specifically, the training data format aligns with the [MessageAPI](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-anthropic-claude-messages.html):\n",
    "\n",
    "<pre style=\"background-color: #e0e0e0;\">\n",
    "{\"system\": string, \"messages\": [{\"role\": \"user\", \"content\": string}, {\"role\": \"assistant\", \"content\": string}]}\n",
    "{\"system\": string, \"messages\": [{\"role\": \"user\", \"content\": string}, {\"role\": \"assistant\", \"content\": string}]}\n",
    "{\"system\": string, \"messages\": [{\"role\": \"user\", \"content\": string}, {\"role\": \"assistant\", \"content\": string}]}\n",
    "</pre>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9faaadf9",
   "metadata": {},
   "source": [
    "In each line, the `system` message is optional information, which is a way of providing context and instructions to Haiku model, such as specifying a particular goal or role, and often known as [system prompt](https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/system-prompts). \n",
    "The `user` input corresponds to the user’s instruction, and the `assistant` input is the desired response that the fine-tuned Haiku model should provide. "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c3f63e13",
   "metadata": {},
   "source": [
    "A common prompt structure for instruction fine-tuning includes a system prompt, an instruction, and an input which provides additional context. Here we define the system prompt which will be added to the MessageAPI, and an intruction header that will be added before each article and together will be the user content of each datapoint."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b27cd57e",
   "metadata": {},
   "outputs": [],
   "source": [
    "system_string = \"Below is an intruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0e706690",
   "metadata": {},
   "outputs": [],
   "source": [
    "instruction = \"\"\"instruction:\n",
    "\n",
    "Summarize the conversation provided below.\n",
    "\n",
    "input:\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7c18cf62",
   "metadata": {},
   "source": [
    "For the 'assistant' component we will refer the summary/highlights of the article. The transformation of each datapoint is performed with the code below"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "53502dcf",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Process the training dataset\n",
    "datapoints_train=[]\n",
    "for dp in dataset['train']:\n",
    "    temp_dict={}\n",
    "    temp_dict[\"system\"] = system_string\n",
    "    temp_dict[\"messages\"] = [\n",
    "        {\"role\": \"user\", \"content\": instruction+dp['dialogue']},\n",
    "        {\"role\": \"assistant\", \"content\": dp['summary']}\n",
    "    ]\n",
    "    datapoints_train.append(temp_dict)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0c717740",
   "metadata": {},
   "source": [
    "An example of a processed datapoint can be printed below"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b756ae7b",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(datapoints_train[4])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "329af4ac",
   "metadata": {},
   "source": [
    "The same processing is done for the validation and test sets as well."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0fdb8697",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Process validation and test sets\n",
    "datapoints_valid=[]\n",
    "for dp in dataset['validation']:\n",
    "    temp_dict={}\n",
    "    temp_dict[\"system\"] = system_string\n",
    "    temp_dict[\"messages\"] = [\n",
    "        {\"role\": \"user\", \"content\": instruction+dp['dialogue']},\n",
    "        {\"role\": \"assistant\", \"content\": dp['summary']}\n",
    "    ]\n",
    "    datapoints_valid.append(temp_dict)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dc2fb7d5",
   "metadata": {},
   "outputs": [],
   "source": [
    "datapoints_test=[]\n",
    "for dp in dataset['test']:\n",
    "    temp_dict={}\n",
    "    temp_dict[\"system\"] = system_string\n",
    "    temp_dict[\"messages\"] = [\n",
    "        {\"role\": \"user\", \"content\": instruction+dp['dialogue']},\n",
    "        {\"role\": \"assistant\", \"content\": dp['summary']}\n",
    "    ]\n",
    "    datapoints_test.append(temp_dict)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1ac5f18f",
   "metadata": {},
   "source": [
    "Here we define some helper functions to process our datapoints further by modifying the number of datapoints we want to include in each set and the max string length of the datapoints we want to include. The final function will convert our datasets into JSONL files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "30314e3e",
   "metadata": {},
   "outputs": [],
   "source": [
    "def dp_transform(data_points,num_dps,max_dp_length):\n",
    "    \"\"\"\n",
    "    This function filters and selects a subset of data points from the provided list based on the specified maximum length \n",
    "    and desired number of data points.\n",
    "    \"\"\" \n",
    "    lines=[]\n",
    "    for dp in data_points:\n",
    "        if len(dp['system']+dp['messages'][0]['content']+dp['messages'][1]['content'])<=max_dp_length:\n",
    "            lines.append(dp)\n",
    "    random.shuffle(lines)\n",
    "    lines=lines[:num_dps]\n",
    "    return lines"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c8794700",
   "metadata": {},
   "outputs": [],
   "source": [
    "def jsonl_converter(dataset,file_name):\n",
    "    \"\"\"\n",
    "    This function writes the provided dataset to a JSONL (JSON Lines) file.\n",
    "    \"\"\"\n",
    "    print(file_name)\n",
    "    with jsonlines.open(file_name, 'w') as writer:\n",
    "        for line in dataset:\n",
    "            writer.write(line)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4254d440",
   "metadata": {},
   "source": [
    "Claude-3 Haiku fine-tuning has following requirements on your datasets:\n",
    "\n",
    "- Context length can be up to 32,000 tokens\n",
    "- Training dataset can not have greater than 10,000 records\n",
    "- Validation dataset can not have great than 1,000 records\n",
    "\n",
    "For simplicity, we will process the datasets as follow"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "eb2e5485",
   "metadata": {},
   "outputs": [],
   "source": [
    "train=dp_transform(datapoints_train,1000,20000)\n",
    "validation=dp_transform(datapoints_valid,100,20000)\n",
    "test=dp_transform(datapoints_test,10,20000)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e1980b82",
   "metadata": {},
   "source": [
    "### Create Local Directory for Datasets"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b4822977",
   "metadata": {},
   "source": [
    "Save the processed data locally and convert them into JSONL formats"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "795299e9",
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset_folder=\"haiku-fine-tuning-datasets-samsum\"\n",
    "train_file_name=\"train-samsum-1K.jsonl\"\n",
    "validation_file_name=\"validation-samsum-100.jsonl\"\n",
    "test_file_name=\"test-samsum-10.jsonl\"\n",
    "!mkdir haiku-fine-tuning-datasets-samsum\n",
    "abs_path=os.path.abspath(dataset_folder)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "389eed26",
   "metadata": {},
   "outputs": [],
   "source": [
    "jsonl_converter(train,f'{abs_path}/{train_file_name}')\n",
    "jsonl_converter(validation,f'{abs_path}/{validation_file_name}')\n",
    "jsonl_converter(test,f'{abs_path}/{test_file_name}')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "35ef2861",
   "metadata": {},
   "source": [
    "### Upload Datasets to S3 Bucket\n",
    "\n",
    "These code blocks upload the created training, validation and test datasets to S3 bucket. Training and validation datasets will be used for Haiku fine-tuning job, and testing dataset will be used to evaluate the performance between fine-tuned Haiku and base Haiku models. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5298c1c8",
   "metadata": {},
   "outputs": [],
   "source": [
    "s3_client.upload_file(f'{abs_path}/{train_file_name}', bucket_name, f'haiku-fine-tuning-datasets/train/{train_file_name}')\n",
    "s3_client.upload_file(f'{abs_path}/{validation_file_name}', bucket_name, f'haiku-fine-tuning-datasets/validation/{validation_file_name}')\n",
    "s3_client.upload_file(f'{abs_path}/{test_file_name}', bucket_name, f'haiku-fine-tuning-datasets/test/{test_file_name}')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e3c21aaf",
   "metadata": {},
   "outputs": [],
   "source": [
    "s3_train_uri=f's3://{bucket_name}/haiku-fine-tuning-datasets/train/{train_file_name}'\n",
    "s3_validation_uri=f's3://{bucket_name}/haiku-fine-tuning-datasets/validation/{validation_file_name}'\n",
    "s3_test_uri=f's3://{bucket_name}/haiku-fine-tuning-datasets/test/{test_file_name}'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ea614c3e",
   "metadata": {},
   "source": [
    "### Storing Variables\n",
    "\n",
    "Please make sure to use the same kernel on fine-tuning Haiku notebook"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b714f27f",
   "metadata": {},
   "outputs": [],
   "source": [
    "%store role_arn\n",
    "%store bucket_name\n",
    "%store role_name\n",
    "%store policy_arn\n",
    "%store s3_train_uri\n",
    "%store s3_validation_uri\n",
    "%store s3_test_uri"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "67dce66e-6589-4d91-8ba6-c341c2dca3ff",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "availableInstances": [
   {
    "_defaultOrder": 0,
    "_isFastLaunch": true,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 4,
    "name": "ml.t3.medium",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 1,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 8,
    "name": "ml.t3.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 2,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.t3.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 3,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.t3.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 4,
    "_isFastLaunch": true,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 8,
    "name": "ml.m5.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 5,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.m5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 6,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.m5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 7,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.m5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 8,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.m5.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 9,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.m5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 10,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.m5.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 11,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 384,
    "name": "ml.m5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 12,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 8,
    "name": "ml.m5d.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 13,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.m5d.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 14,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.m5d.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 15,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.m5d.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 16,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.m5d.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 17,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.m5d.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 18,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.m5d.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 19,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 384,
    "name": "ml.m5d.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 20,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": true,
    "memoryGiB": 0,
    "name": "ml.geospatial.interactive",
    "supportedImageNames": [
     "sagemaker-geospatial-v1-0"
    ],
    "vcpuNum": 0
   },
   {
    "_defaultOrder": 21,
    "_isFastLaunch": true,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 4,
    "name": "ml.c5.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 22,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 8,
    "name": "ml.c5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 23,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.c5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 24,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.c5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 25,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 72,
    "name": "ml.c5.9xlarge",
    "vcpuNum": 36
   },
   {
    "_defaultOrder": 26,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 96,
    "name": "ml.c5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 27,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 144,
    "name": "ml.c5.18xlarge",
    "vcpuNum": 72
   },
   {
    "_defaultOrder": 28,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.c5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 29,
    "_isFastLaunch": true,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.g4dn.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 30,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.g4dn.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 31,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.g4dn.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 32,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.g4dn.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 33,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.g4dn.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 34,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.g4dn.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 35,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 61,
    "name": "ml.p3.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 36,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "hideHardwareSpecs": false,
    "memoryGiB": 244,
    "name": "ml.p3.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 37,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 488,
    "name": "ml.p3.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 38,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 768,
    "name": "ml.p3dn.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 39,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.r5.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 40,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.r5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 41,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.r5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 42,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.r5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 43,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.r5.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 44,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 384,
    "name": "ml.r5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 45,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 512,
    "name": "ml.r5.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 46,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 768,
    "name": "ml.r5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 47,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.g5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 48,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.g5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 49,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.g5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 50,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.g5.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 51,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.g5.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 52,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.g5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 53,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "hideHardwareSpecs": false,
    "memoryGiB": 384,
    "name": "ml.g5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 54,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 768,
    "name": "ml.g5.48xlarge",
    "vcpuNum": 192
   },
   {
    "_defaultOrder": 55,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 1152,
    "name": "ml.p4d.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 56,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 1152,
    "name": "ml.p4de.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 57,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.trn1.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 58,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 512,
    "name": "ml.trn1.32xlarge",
    "vcpuNum": 128
   },
   {
    "_defaultOrder": 59,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 512,
    "name": "ml.trn1n.32xlarge",
    "vcpuNum": 128
   }
  ],
  "instance_type": "ml.t3.medium",
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
